INTRODUCTION


The  following  is  a  description  of  the  desired  capabilities  of  a  system  which,  when 
placed  in  a  distant  and  potentially  hostile  environment,  would  provide  a  human  operator  with 
an  accurate  representation  of  the  sound  field  in  that  environment.  Optimally,  this  system 
would  provide  the  human  with  auditory  sensory  inputs  equivalent  to  those  which  he  would 
actually  experience  in  the  foreign  environment.  In  addition,  it  would  provide  extensions  of 
human  sensory  capability  including  the  following:  transmission  and  display  of  ultrasonic  and 
infrasonic  information,  an  active  sonar  system  which  would  utilize  and  extend  present  human 
echolocation  capabilities,  and  the  ability  to  hear  and  accurately  localize  sounds  in  underwater 
environments. 

A  great  deal  of  research  has  been  done  in  attempts  to  reproduce  human  hearing  and 
localization  abilities  with  a  mechanical  system.  These  investigations  have  been  carried  out 
primarily  by  two  groups:  members  of  the  recording  industry  who  are  interested  in  providing 
listeners  with  spatial  displays  of  music,  and  designers  of  hearing  aids  who  are  interested  in 
development of dummy heads for testing and evaluation. Monaural hearing and localization
have been demonstrated with artificial pinnae (Batteau, 1967; Shaw, 1979), and binaural
localization has been accomplished through the use of dummy heads (Burkhard and Sachs,
1975; Kuhn, 1979A, 1979B). The designers of dummy heads have sought to replicate binaural
intensity differences resulting from head baffle and shadow effects, and binaural phase
differences resulting from path length differences between the ears.

Several attempts have been made to replicate binaural hearing by replacing the head
and pinna structures with models of various shapes and materials (Wonsdronk, 1959; Wiener,
1947A, 1947B). The performance of such models is necessarily suboptimal, and they have
met with varying degrees of success, depending on their applications. Variation of some head
and pinna parameters produces only second-order effects on localization, whereas variation in
others produces more marked effects. Certainly, the more closely a model resembles the
human head and ears, the more realistic will be the operator's perception of the sound field.
However, many parameters need not be duplicated in order to provide accurate and
unambiguous sound localization, given that time is available for training the operator with the
suboptimal system, and that he is able to learn new localization cues. That is, he must be able
to learn that sounds which he perceives at a given angle may actually be shifted by some
constant amount.

Many  investigators  have  also  examined  signal  design  and  array  design  characteristics 
for  active  sonars.  Most  of  this  work  has  simply  involved  target  detection,  but  investigations 
into  target  identification  and  localization  have  been  undertaken  in  the  interest  of  developing 
a sonar mobility aid for the blind. Considerable recent work has demonstrated the object
discrimination ability of divers using portable underwater sonar systems.

The several sections of this document describe parameters of a system to transfer
auditory sensory data to an operator, and the effects which might be expected if these
parameters are varied. First is a description of monaural hearing and localization and related
design of an artificial pinna. Next is a presentation of information about binaural localization
cues and a discussion of their relative saliency as a function of various dummy head parameters.
This is followed by a brief discussion of the design of a system for passive hearing and
localization under water. The final two sections discuss implementation of active sonar for
echolocation. Vast differences exist in echo discrimination capabilities in air and water, due to
impedance mismatching between the target and the medium. Thus, the first implementation
section presents information on what is achievable in terms of obstacle detection and
discrimination by ear without the aid of extra equipment. This form of sonar is quite limited,
although with practice most people can at least avoid obstacles. The second implementation
section discusses various aspects of signal design for active sonar. The type of signal and beam
pattern which is most desirable depends on the types of discriminations being attempted.


MONAURAL  LOCALIZATION 


Monaural localization is accomplished primarily by the pinna, and to some extent by
reflections from the torso (Butler, 1969; Kuhn, 1979B; Batteau, 1962, 1967). Localization
in the vertical plane is almost exclusively monaural (Butler, 1969), and monaural cues
contribute to localization in the azimuthal plane, although accurate localization in azimuth
requires a combination of monaural and binaural cues. Localization of sound in the azimuthal
plane is possible monaurally, but performance is severely degraded from that possible with
two ears (Butler, 1969).

Essentially, the pinna acts as an antenna whose directivity pattern is extremely
frequency-dependent. Localization is accomplished via time delays between direct and
pinna-reflected sound (Wright, Hebrank, and Wilson, 1974), with the delays being a function both
of frequency and the relative orientation of observer and sound source. The pinna is a very
poor sound collector at low frequencies (Shaw, 1979), and above 5 kHz, localization is largely
the result of high-order transverse modes within the pinna (Kuhn, 1979A, 1979B). The shape
of the pinna is ideally suited for localizing broadband signals. Components of a complex sound
arriving at different parts of the pinna take paths of different lengths by being multiply
reflected to reach the canal entrance (Konishi, 1976). Batteau (1967) has described localization as
resulting from differences in sound arrival times at the eardrum. He has shown that a unique
set of arrival times is possible for each set of spatial coordinates. Since the sound pressure
transformation from the ear canal entrance to the eardrum is independent of the sound
direction (Wiener and Ross, 1946; Shaw and Teranishi, 1968), the arrival time differences must
result from multiple reflections by the complex pinna surfaces. Therefore, in designing an
artificial pinna, it is essential that the pinna contours and size be very accurate.

Monaural localization can be simulated quite effectively by mounting a small
microphone at the canal entrance of an artificial pinna and presenting the signals to an operator
wearing headphones. As mentioned earlier, the directivity pattern of the pinna, and
consequently its size and shape, are extremely critical for accurate localization. In the design of an
artificial auditory system, it would be desirable to allow for interchangeable pinnae (Burkhard
and Sachs, 1975). Given such a system, a model could be made of each operator's pinna so as
to minimize the number of new localization cues he must learn. Sizes and shapes of pinnae
differ greatly among individuals (Burkhard and Sachs, 1975), and obviously, each person has
learned to localize sounds through his own ears. If the system were designed with a standard
“average” pinna, it is probable that unambiguous localization would still be possible. It seems
plausible that operators could learn the localization cues associated with a new pinna, but the
evidence concerning this point is inconclusive. In one study summarized by Trahiotis and
Robinson (1979), subjects were presented with binaural recordings made via miniature
microphones mounted in the ears of other subjects. One observer who performed poorly on a
localization task using his own pinna did quite well using recordings from the pinna of another
subject. Other subjects were also unable to localize with the stimuli recorded in the pinna of
the first observer. This suggests two major results. First, the pinnae of some people are
poorly suited to localization, and secondly, it is possible for an observer to learn new localization
cues associated with a different pinna.


Since  multiple  reflections  in  the  pinna  are  so  critical  to  localization,  it  seems  that  far 
too  little  attention  has  been  given  in  the  literature  to  the  choice  of  materials  for  constructing 
artificial  pinnae.  Most  investigators  have  used  either  rubber  or  plastic  (Burkhard  and  Sachs, 
1975; Shaw, 1968; Batteau, 1963). The effect which another, possibly more reflective,
material would have on pinna directivity is not known to this author.

In order to replicate human hearing, the microphone mounted in the pinna should
optimally have a frequency response from 20 Hz to 20 kHz, and a dynamic range from 0 to 130
dB re 20 µPa. Of course, reduction of the frequency response to a narrower band, or its
extension to ultrasonic and infrasonic regions with appropriate methods of display, may be
desirable, depending on the sounds which must be received. For example, if the system’s only
purpose were to transmit speech signals, a frequency response from 300 Hz to 5 kHz, slightly
exceeding that in most telephones, would probably be sufficient. The microphone should be
no larger than one half inch in diameter, and because of its small size and location within the
highly directional pinna, the microphone directivity pattern is not critical (Lybarger and
Barron, 1965).
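
The dynamic range quoted above can be restated in absolute pressure. The following is a minimal sketch, assuming only the standard definition of sound pressure level relative to 20 µPa; the function names are illustrative and do not come from the original report.

```python
import math

P_REF = 20e-6  # reference pressure in pascals (20 µPa), as quoted in the text


def spl_to_pressure(level_db):
    """Convert a level in dB re 20 µPa to RMS sound pressure in pascals."""
    return P_REF * 10 ** (level_db / 20.0)


def pressure_to_spl(pressure_pa):
    """Convert an RMS sound pressure in pascals to a level in dB re 20 µPa."""
    return 20.0 * math.log10(pressure_pa / P_REF)


if __name__ == "__main__":
    for level in (0, 60, 130):
        print(f"{level:3d} dB  ->  {spl_to_pressure(level):.6g} Pa")
```

On this definition, a microphone covering 0 to 130 dB must handle pressures spanning a ratio of 10^6.5, or roughly three million to one.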

As  mentioned  earlier,  the  best  reproduction  of  the  sound  field  would  be  achieved 
with  the  microphone  mounted  at  the  entrance  to  the  ear  canal  of  the  artificial  head,  and  with 
the operator wearing headphones. The transfer function of the ear canal is effectively
independent of orientation with respect to the sound source (Wiener and Ross, 1946; Shaw and
Teranishi,  1968).  However,  the  transfer  function  is  extremely  frequency  dependent,  tending 
to  amplify  frequencies  in  the  1-5  kHz  range.  The  most  accurate  reproduction  will  be  obtained 
if  the  operator’s  own  canal  performs  this  frequency-dependent  amplification,  i.e.,  it  would  be 
undesirable  to  duplicate  canal  resonances  by  mounting  the  microphone  at  the  drum  location. 

The  headphones  worn  by  the  operator  should  reproduce  the  sound  field  at  the  same 
relative  location  where  it  was  picked  up,  i.e.,  at  the  canal  entrance.  Damaske  (1971)  has 
attempted  to  reproduce  via  loudspeakers  the  sounds  picked  up  by  a  dummy  head.  Accurate 
localization  is  not  achieved  in  these  circumstances  because  the  sound  which  is  initially  received 
within  the  ear  is  reproduced  into  the  entire  room.  The  desired  modes  are  not  excited,  nor  will 
the  directivity  pattern  at  the  ear  be  accurate  since  it  will  be  modified  by  the  acoustics  of  the 
listening  room,  by  the  beam  pattern  of  the  loudspeakers,  and  by  the  distance  and  orientation 
of  the  loudspeakers  with  respect  to  the  ears. 


BINAURAL LOCALIZATION AND DUMMY HEAD DESIGN


Determination  of  the  direction  of  a  sound  source  is  greatly  enhanced  by  the  use  of 
binaural cues in conjunction with the monaural cues discussed in the previous section.
Vertical localization, which uses mainly monaural cues, is much less accurate than is localization
in  azimuth  which  employs  both  monaural  and  binaural  cues.  The  advantages  gained  by  two 
ears  separated  by  an  obstacle,  the  head,  are  most  apparent  when  discriminating  between 
closely  spaced  sources  which  are  narrowband  and  of  high  frequency  (Butler,  1969). 

It is well known that people can localize broadband sources more easily than
narrowband sources (Konishi, 1976). Altes (1978) has shown on theoretical grounds that accurate,
unambiguous  localization  can  be  accomplished  in  both  azimuth  and  elevation  using  only  two 
ears,  given  the  following  conditions: 

1. The pinna has a sufficiently directional beam pattern.

2.  The  two  ears  are  acoustically  isolated  from  one  another. 

3. The width of the signal’s autocorrelation function is narrower than the
interpinna spacing, i.e., the signal is sufficiently broadband.
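
Condition 3 can be given a rough numerical meaning. The sketch below is not taken from Altes (1978). It assumes the common rule of thumb that a signal of bandwidth B has an autocorrelation main lobe about 1/B seconds wide, which corresponds to a distance of c/B in the medium. The speed of sound (343 m/s) is an assumed value, and the 19 cm spacing is the figure quoted later in this section.

```python
SPEED_OF_SOUND_AIR = 343.0  # m/s, assumed value for air at room temperature
EAR_SPACING = 0.19          # m, interpinna distance quoted in the text


def autocorrelation_width(bandwidth_hz, c=SPEED_OF_SOUND_AIR):
    """Approximate spatial width (m) of the autocorrelation main lobe, taken as c / B."""
    return c / bandwidth_hz


def min_bandwidth(spacing=EAR_SPACING, c=SPEED_OF_SOUND_AIR):
    """Bandwidth (Hz) above which the approximate width c / B is below the ear spacing."""
    return c / spacing


if __name__ == "__main__":
    print(f"minimum bandwidth: about {min_bandwidth():.0f} Hz")
    for b in (500.0, 2000.0, 8000.0):
        ok = autocorrelation_width(b) < EAR_SPACING
        print(f"B = {b:6.0f} Hz  width = {autocorrelation_width(b) * 100:5.1f} cm  broadband enough: {ok}")
```

Under these assumptions, a signal needs a bandwidth on the order of 2 kHz or more to meet the condition in air.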

Isolation  and  interpinna  spacing  are  a  function  of  the  composition  and  size  of  the 
head.  When  designing  a  dummy  head,  these  parameters  are  often  altered  from  those  of  a  real 
human  head.  The  effects  of  such  alterations  on  localization  will  be  discussed  later  in  this 
section. 


Localization  at  low  frequencies  is  primarily  the  result  of  time  or  phase  differences  at 
the  two  ears.  Unless  a  sound  source  is  located  in  the  median  plane  of  the  head,  the  path 
lengths to the ears will be different. While it is generally true that the auditory system is
insensitive to absolute phase, e.g., identification of a tone as either a sine or cosine wave with
no  reference  for  comparison,  it  is  quite  sensitive  to  temporal  shifts  in  the  phase  of  complex 
stimuli  or  to  phase  differences  in  dichotic  stimuli.  In  such  cases,  ample  reference  information 
is  available. 

The difference in path lengths between a sound source and each of the two ears is a
function of the orientation of the observer and sound source and of the distance between the
ears, approximately 19 cm for the average human (Konishi, 1976). Interaural phase
differences are valid localization cues only at low frequencies. If the half wavelength of a sound is
shorter than the distance between the ears, a binaural phase difference of more than 180
degrees results and it is impossible to tell which ear is in the leading phase (Konishi, 1976).
For an interpinna distance of 19 cm, the upper frequency limit for binaural phase comparison
in air is approximately 900 Hz. Kuhn (1977) has measured 1400-1500 Hz as the frequency
above which subjects switch from interaural time difference to interaural intensity difference
as a localization cue. Interaural phase difference can be transferred into interaural time
difference since the difference in path lengths between the two ears is given by D sin(theta),
and the interaural time difference is (D/c) sin(theta), where D is the distance between the ears,
theta is the azimuthal angle of the sound source with respect to the median plane, and c is the
speed of sound in air.
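
The relations above can be checked numerically. The following is a minimal sketch, assuming a speed of sound in air of 343 m/s (a value not stated in the text) and the 19 cm ear spacing quoted above. The function names are illustrative.

```python
import math

SPEED_OF_SOUND_AIR = 343.0  # m/s, assumed value for air at room temperature
EAR_SPACING = 0.19          # m, interpinna distance D quoted in the text


def path_difference(theta_deg, d=EAR_SPACING):
    """Path length difference D sin(theta) in metres, theta measured from the median plane."""
    return d * math.sin(math.radians(theta_deg))


def interaural_time_difference(theta_deg, d=EAR_SPACING, c=SPEED_OF_SOUND_AIR):
    """Interaural time difference (D/c) sin(theta) in seconds."""
    return path_difference(theta_deg, d) / c


def phase_ambiguity_limit(d=EAR_SPACING, c=SPEED_OF_SOUND_AIR):
    """Frequency (Hz) at which half a wavelength equals the ear spacing: f = c / (2 D)."""
    return c / (2.0 * d)


if __name__ == "__main__":
    for theta in (0, 15, 45, 90):
        itd_us = interaural_time_difference(theta) * 1e6
        print(f"theta = {theta:2d} deg  ITD = {itd_us:6.1f} microseconds")
    print(f"phase comparison limit: about {phase_ambiguity_limit():.0f} Hz")
```

With these assumed values the limit evaluates to roughly 900 Hz, in agreement with the figure given in the text, and the largest time difference (source directly to one side) is a little over half a millisecond.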

At high frequencies, localization is governed by differences in intensity at the two
ears. If the wavelength of sound is comparable to or smaller than the head diameter, the head
acts as a baffle. The intensity at the ear nearest the sound source is increased relative to the
free field intensity. Likewise, the shadow effect of the head decreases the intensity at the
other ear. Mills (1958) has measured the threshold for interaural intensity differences under
optimum conditions to be as small as 0.5 dB. Head diffraction effects can be as large as ±14
dB (Olson and Carhart, 1975; Shaw, 1974).

Many  investigators  have  measured  the  obstacle  effect  of  the  head  either  to  determine 
optimum microphone placement for hearing aids or to create binaural effects in sound
recording. The most obvious conclusion of these studies is that binaural intensity differences caused
by head baffle and shadow effects increase as a function of frequency. Localization is
therefore more accurate at high frequencies.

The obstacle effect of the head is a function of the head size and shape and the
impedances of the materials of which the head is composed. The extent to which these parameters
are critical in the design of a dummy head depends greatly on the purpose for which it will be
used. Kuhn (1972A), for example, has found that the head’s obstacle effect can be closely
approximated by a rigid sphere with the same cross section as that of the head. Olson and
Carhart (1975) found that head baffle effects were somewhat greater in front than in back,
whereas shadow effects were greater in back. This would, of course, not be true for a sphere.
Wiener (1947A, 1947B) points out similarities between the diffraction patterns around the
head and those around spheres and cylinders. Wonsdronk (1959), on the other hand, found
very poor agreement between diffraction patterns around the head as compared with other
shapes. Wonsdronk also obtained poor results when measuring diffraction patterns around a
plaster cast of a human head. He attributed this poor agreement to differences in surface
impedance between the real and artificial heads, an idea which has not been supported by more
recent evidence (Burkhard and Sachs, 1975; Kuhn, 1979A). Burkhard and Sachs attribute the
failure of Wonsdronk’s plaster model to the fact that it was mounted on a highly absorbent
medium to serve as a torso. They point out that the human torso is highly reflective,
concluding that the plaster head would have given more accurate diffraction patterns were it mounted
on a reflective object. The effect of torso reflections on localization will be discussed later in
this section.

The  following  paragraphs  discuss  the  design  of  a  binaural  localization  system  and  the 
predicted  effects  of  changing  various  parameters  from  those  of  a  human  head.  If  the  purpose 
of a binaural localization system is to transmit the sound field to an operator without
modification, i.e., to create the exact acoustic impression of being present in the foreign
environment, it would be necessary to completely replicate the human head, pinnae and torso in all
respects. However, many head parameters are not critical acoustically, and the effect of
changing these may not even be noticeable. Furthermore, while changes in other design parameters
will produce noticeable effects, many deoptimizations will still allow unambiguous
localization, given that the operator can learn new localization cues.

The parameters listed below are believed to be most critical in the design of a binaural
localization  system. 

1.  Pinna  shape  and  size.  As  mentioned  previously,  it  may  be  desirable  to  design  a 
head  with  interchangeable  pinnae,  such  that  each  operator  could  use  the  system  with  models 
of his own pinnae. The effect of the pinna as an obstacle for creating shadow effects is
insignificant in the frequency range of interest (Temby, 1965).

2. Spacing between pinnae, i.e., head size.

3.  The  presence  of  an  obstacle  between  the  ears  to  create  diffraction  patterns. 
The  shape  of  the  obstacle  will  be  discussed  in  a  later  paragraph. 

4.  Some  degree  of  acoustic  isolation  between  the  pinnae.  The  impedance  mismatch 
between  the  obstacle  and  the  air  will  probably  provide  sufficient  acoustic  isolation.  This  may 
not  be  true  for  underwater  localization,  where  the  impedances  of  the  medium  and  the  head 
are  much  more  closely  matched. 

5.  The  presence  of  a  reflecting  surface  to  simulate  a  torso  below  the  dummy  head. 
Torso reflections are particularly critical to elevation localization (Kuhn, 1979B). Specific
effects of torso parameters have been investigated by Burkhard and Sachs (1975), and by
Kuhn  (1977,  1979A,  1979B).  However,  for  accurate  elevation  localization,  possibly  with 
new  cues,  it  is  probably  only  necessary  that  a  reflecting  surface  of  some  arbitrary  shape  be 
present  below  the  head. 

The parameters listed below are least critical to dummy head design, i.e., they
produce only second or higher order effects in diffraction patterns below 10 kHz:

1. Contents of the head cavity.

2. Skin or surface impedance of the head (Burkhard and Sachs, 1975; Kuhn,
1979A).

3.  Hair  or  hair  style  (Wonsdronk,  1959). 

4. Facial features or other geometric fine structure (Kuhn, 1979A). Similar to the
pinnae,  the  facial  features  are  too  small  when  compared  to  wavelengths  of  sound  to  cause 
much  variation  in  head  diffraction. 

5. Material of which the head is made (Kuhn, 1979A). As long as the head surface
is acoustically reflective, it will serve as a baffle in air. Its impedance differs sufficiently
from that of air that the surface may be regarded as having infinite acoustic impedance. This
would not be true in water, however, since the acoustic impedance of water is greater than that
of air by a factor of 3000 (Kinsler and Frey, 1962). Most hearing under water is accomplished
via bone conduction (Hollien, 1973), with very poor localization, and the material of which an
artificial head is composed becomes more critical. Thus, a material which is more acoustically
opaque than that of the human head must be used.
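
The contrast between air and water in item 5 can be illustrated with the normal-incidence plane-wave reflection coefficient. This is a rough sketch under stated assumptions, not a model of head diffraction. The densities and sound speeds are typical textbook values, and the figure used for head tissue (1.6e6 Pa·s/m) is an assumed order-of-magnitude value for soft tissue. None of these numbers appear in the original report.

```python
def characteristic_impedance(density, sound_speed):
    """Characteristic acoustic impedance rho * c in Pa·s/m (rayl)."""
    return density * sound_speed


def pressure_reflection_coefficient(z_medium, z_obstacle):
    """Normal-incidence plane-wave pressure reflection coefficient at a boundary."""
    return (z_obstacle - z_medium) / (z_obstacle + z_medium)


# Assumed typical values (not from the report).
Z_AIR = characteristic_impedance(1.21, 343.0)      # about 415 rayl
Z_WATER = characteristic_impedance(998.0, 1481.0)  # about 1.48e6 rayl
Z_TISSUE = 1.6e6                                   # assumed soft-tissue value, rayl

R_IN_AIR = pressure_reflection_coefficient(Z_AIR, Z_TISSUE)
R_IN_WATER = pressure_reflection_coefficient(Z_WATER, Z_TISSUE)

if __name__ == "__main__":
    print(f"water/air impedance ratio: {Z_WATER / Z_AIR:.0f}")
    print(f"reflection coefficient, tissue in air:   {R_IN_AIR:.4f}")
    print(f"reflection coefficient, tissue in water: {R_IN_WATER:.4f}")
```

With these assumed values the impedance ratio comes out at a few thousand, the same order as the round figure of 3000 quoted above. A tissue-like surface reflects almost all incident sound in air but only a few percent in water, which is why the head material matters much more under water.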


As mentioned earlier, intensity differences at the two ears require the presence of an
obstacle to produce baffle and shadow effects. An obstacle of virtually any shape will
produce such intensity differences (Wiener, 1947A, 1947B; Wonsdronk, 1959). Since the
auditory system is sensitive to interaural intensity differences as small as 0.5 dB (Mills, 1958), it
seems that the choice of obstacle shapes would not be critical to high frequency localization.
However, changes in head shape will affect the azimuthal dependence of interaural intensity
differences. When the head is replaced with a rigid sphere, large interaural intensity differences
exist when the sound source is located to either side of the head (Kuhn, 1979A). However,
with a sphere or any obstacle which has symmetry between the front and back halves,
front-back localization errors will occur for sound sources located in the median plane (Canby,
1977). Discriminations between front and back are ambiguous in such cases because the
sound field will be identical whether the sound source is in front or in back of the head. This
is not true for a real head, where differences in baffle and shadow effects have been noted
between front and back (Temby, 1965; Olson and Carhart, 1975).

In  the  opinion  of  this  writer,  an  artificial  head  made  from  an  obstacle  of  virtually  any 
shape  should  allow  unambiguous  localization,  provided  it  is  large  enough  to  act  as  a  baffle  in 
the desired frequency range. Certainly, new localization cues would have to be learned if, for
example, an artificial head were cylindrical (Wiener, 1947B). However, such a shape would
produce azimuth-dependent interaural intensity differences, thus allowing unambiguous
localization. Again, front-back ambiguities will result if the front and back halves of the obstacle
are  symmetrical,  so  it  seems  reasonable  to  avoid  this  situation  in  the  choice  of  obstacle  shapes. 

When  subjects  wearing  headphones  are  presented  with  sounds  picked  up  via  a  dummy 
head, the effect is initially that of lateralization rather than localization. That is, when subjects
are presented with dichotic stimuli through headphones, the sound images are perceived as
originating somewhere in the head. However, Molino (1974) has shown that when asked to
point in the direction of the sound source in space, subjects can accurately localize the
source in the azimuthal plane.


DESIGN CONSIDERATIONS FOR AN UNDERWATER AUDITORY SYSTEM


If  a  dummy  head  system  were  designed  to  provide  monaural  and  binaural  localization 
capabilities  under  water,  it  would,  of  necessity,  be  much  different  from  that  discussed  earlier. 
The differences between hearing under water and that in air are governed by two primary
factors. First, the speed of sound in water, and therefore the wavelength at any given frequency,
is  increased  by  a  factor  of  five.  Additionally,  the  acoustic  impedance  of  water  is  greater  than 
that  of  air  by  a  factor  of  3000  (Kinsler  and  Frey,  1962).  These  differences  result  in  a  number 
of  disadvantages  for  underwater  localization  as  follows: 

1.  The  low  frequency  range  over  which  the  pinna  is  a  poor  sound  collector  (Shaw, 
1979)  is  increased  by  a  factor  of  five.  In  fact,  the  pinna  contributes  very  little  to  underwater 
localization  (Hollien  and  Feinstein,  1975). 

2.  Interaural  time  differences  resulting  from  differences  in  path  lengths  to  the  two 
ears  are  reduced  by  a  factor  of  five. 

3.  The  ratio  of  interpinna  spacing  to  wavelength  is  decreased  by  the  same  factor. 

4.  The  head  is  small  compared  to  a  wavelength  at  much  higher  frequencies  in  water. 
Therefore,  interaural  intensity  differences  at  any  frequency,  caused  by  the  head’s  baffle  and 
shadow  effects,  are  greatly  reduced. 

5.  Human  underwater  hearing  is  accomplished  primarily  via  bone  conduction 
(Hollien,  1973;  Hollien  and  Feinstein,  1975).  The  impedance  of  the  head  is  much  more  closely 
matched  to  water  than  to  air;  thus,  the  head  does  not  simply  act  as  a  reflector  (Hollien  and 
Feinstein,  1975). 

6.  As  a  result  of  bone  conduction,  the  degree  of  acoustic  isolation  between  the  ears 
is  greatly  reduced. 
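
The scaling in items 2 through 4 can be made concrete with a short calculation. The sketch below takes the round factor of five from the text as a parameter. Measured sound speeds give a ratio nearer 4.3 to 4.4, so the factor can be changed accordingly. The speed of sound in air (343 m/s) and the use of the 19 cm ear spacing as a stand-in for head diameter are assumptions made for illustration only.

```python
C_AIR = 343.0        # m/s, assumed speed of sound in air
SPEED_FACTOR = 5.0   # round factor quoted in the text (measured ratio is nearer 4.3 to 4.4)
C_WATER = SPEED_FACTOR * C_AIR
EAR_SPACING = 0.19   # m, interpinna distance quoted earlier in the text
HEAD_DIAMETER = 0.19 # m, assumed equal to the ear spacing for this illustration


def max_time_difference(spacing, c):
    """Largest interaural time difference (source directly to one side), in seconds."""
    return spacing / c


def phase_limit(spacing, c):
    """Frequency (Hz) at which half a wavelength equals the ear spacing."""
    return c / (2.0 * spacing)


def baffle_onset(diameter, c):
    """Frequency (Hz) at which one wavelength equals the obstacle diameter."""
    return c / diameter


if __name__ == "__main__":
    for name, c in (("air", C_AIR), ("water", C_WATER)):
        print(f"{name:5s}  max ITD {max_time_difference(EAR_SPACING, c) * 1e6:5.0f} us"
              f"  phase limit {phase_limit(EAR_SPACING, c):5.0f} Hz"
              f"  baffle onset {baffle_onset(HEAD_DIAMETER, c):5.0f} Hz")
```

The baffle onset frequency is only a rough marker for where head shadow begins to matter. The point of the comparison is that every time-based cue shrinks, and every frequency threshold rises, by the same factor as the sound speed.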

All  of  the  factors  which  contribute  to  sound  localization  in  air  are  degraded  in  water. 
These  considerations  lead  to  the  initial  impression  that  underwater  localization  by  a  submerged 
human  head  is  impossible,  a  view  which  was  commonly  held  for  many  years.  However,  several 
studies  have  shown  that  sound  localization  is  possible  under  water,  although  it  is  considerably 
less accurate than in air (Feinstein, 1966, 1973; Hollien, 1973). Subjects are quite accurate at
left-right  discriminations,  and  they  can  identify  the  quadrant  in  which  a  sound  source  is 
located,  but  discrimination  between  closely  spaced  sources  is  poor.  Feinstein  (1966)  showed 
that  divers  could  accurately  make  left-right  discriminations  when  sources  were  only  fifteen 
degrees  off  the  midline.  Hollien  (1973)  found  that  low  frequency  and/or  broadband  stimuli 
were  localized  most  accurately  under  water. 

While the magnitudes of localization cues are much greater in air than in water, it has
been postulated that localization is possible with cues of much smaller magnitude (Feinstein,
1973; Hollien, 1973). Thus, arrival time differences at the ears may still be sufficient to allow
crude localization under water. Additionally, Hollien postulates that localization based on
intensity differences may be possible in bone-conduction hearing due to the angular
dependence of the head’s sensitivity to skull-conducted sound. However, a greater degree of isolation
between the ears would be required to make maximal use of this cue. Norman et al. (1971)
report that the localization ability of some fish is enhanced by the fact that their hearing
mechanisms are acoustically isolated via cartilage. Such isolation is found in the hearing
mechanisms of cetacea, which exhibit excellent auditory localization capabilities.

In  view  of  the  role  of  acoustical  impedance  mismatch  between  water  and  head,  the 
choice  of  materials  in  the  design  of  a  dummy  head  becomes  very  critical.  Surface  impedance 
may  not  be  ignored,  thus  complicating  the  design  as  compared  to  that  for  dummy  heads  in 
air.  If  the  pinnae  are  to  contribute  any  directionality  to  the  receiving  system,  they  must  be 
greatly  enlarged  relative  to  their  counterparts  in  air.  Batteau  and  Plante  (1962)  and  Batteau 
(1963)  have  had  reasonable  success  with  underwater  localization  using  an  enlarged  pinna. 
Furthermore,  it  would  be  beneficial  to  enlarge  the  head,  both  to  increase  interpinna  spacing 
and  to  increase  the  obstacle  effect  of  the  head  at  moderate  frequencies.  A  head  of  this  type 
should  also  be  made  of  several  parts  which  are  acoustically  isolated  from  one  another. 

The above considerations suggest that it is probably impractical to design an
underwater localization system based solely on an enlarged dummy head. An alternative approach
worth investigating is to design an electronic system which would measure and then augment
differences in interaural time, phase, and intensity. For example, the intensity difference
between receivers of a two-channel system could be monitored, and this information used to
control  a  feedback  network  to  two  amplifiers  having  independent  gain  controls.  Likewise, 
present  technology  should  allow  implementation  of  an  adaptive  delay  line  to  augment  arrival 
time  differences  between  the  ears. 
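
The augmentation scheme suggested above can be sketched in a few lines. The following is only an illustration under stated assumptions, not a description of any existing system: the function names, the augmentation factor of 2, and the test signal are hypothetical, and the feedback network of the text is simplified to a single feedforward measurement followed by independent channel gains and an added delay.

```python
import math

def rms(samples):
    """Root-mean-square level of one channel."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def level_difference_db(left, right):
    """Interaural intensity difference in dB (positive when left is louder)."""
    return 20.0 * math.log10(rms(left) / rms(right))

def augment_intensity(left, right, factor=2.0):
    """Apply independent gains so the level difference becomes factor times larger."""
    extra_db = (factor - 1.0) * level_difference_db(left, right)
    g_left = 10.0 ** (extra_db / 40.0)    # half of the extra difference added on one side
    g_right = 10.0 ** (-extra_db / 40.0)  # and half removed on the other
    return [g_left * s for s in left], [g_right * s for s in right]

def estimate_lag(left, right, max_lag):
    """Samples by which right trails left, taken from the cross-correlation peak."""
    n = len(left)
    def score(lag):
        return sum(left[i] * right[i + lag] for i in range(n) if 0 <= i + lag < n)
    return max(range(-max_lag, max_lag + 1), key=score)

def augment_delay(left, right, lag, factor=2.0):
    """Delay the trailing channel further so the lag grows to about factor * lag."""
    pad = [0.0] * int(round((factor - 1.0) * abs(lag)))
    if lag >= 0:
        return list(left) + pad, pad + list(right)
    return pad + list(left), list(right) + pad

# Hypothetical test signal: a click that reaches the right receiver
# 3 samples late and at half the amplitude.
left = [0.0] * 32
right = [0.0] * 32
left[10], right[13] = 1.0, 0.5

louder_left, quieter_right = augment_intensity(left, right)
lag = estimate_lag(left, right, max_lag=8)
early_left, late_right = augment_delay(left, right, lag)
```

In a practical system the measurement would be repeated over short time windows, so that the gains and the delay track the changing position of the source.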

UNAIDED  HUMAN  ECHOLOCATION


It  has  been  known  for  many  years  that  blind  humans  often  possess  the  ability  to 
detect  and  avoid  obstacles  without  actually  contacting  them.  The  nature  of  this  obstacle 
sense,  which  has  been  misnamed  “facial  vision”  in  much  of  the  literature,  was  not  known, 
however,  until  very  recently.  Subjects  described  the  perception  as  being  similar  to  a  veil  or 
curtains  near  the  face  (Kohler,  1966;  Ammons,  et  al.,  1953).  Investigations  reported  in  three 
papers  by  researchers  at  Cornell  University  finally  showed  conclusively  that  audition  was 
both  a  necessary  and  sufficient  condition  for  the  perception  of  obstacles  (Supa,  et  al.,  1944;
Worchel,  et  al.,  1947;  Cotzin,  et  al.,  1950).  Furthermore,  none  of  the  other  possible  cues,
including  eardrum  pressure,  temperature  changes,  or  various  tactile  sensations,  were  found  to
be  necessary  or  sufficient  to  elicit  the  perception. 

Subsequent  investigators  have  sought  to  quantify  this  obstacle  sense  in  terms  of  what
can  be  discriminated,  within  what  ranges  the  perception  is  operable,  and  what  types  of  auditory
signals  are  best  suited  to  the  perception.  It  was  found  that  the  perception  was  limited  neither
to  the  blind  nor  to  laboratory  environments  (Kohler,  1964;  Ammons,  et  al.,  1953).
Sighted  subjects  who  were  blindfolded  performed  as  well  as  blind  subjects  after  considerable 
practice.  Additionally,  learning  was  found  to  be  sudden  and  insightful,  implying  that  subjects 
needed  merely  to  recognize  the  existence  of  a  previously  unused  perception  (Ammons,  et  al.,
1953).  It  was  demonstrated  that  blind  and  blindfolded  subjects  performed  detection  and  discrimination
tasks  with  the  same  degree  of  accuracy,  whether  in  the  laboratory  or  in  a  real-world,
outdoor  environment  (Ammons,  et  al.,  1953).

Various  studies  have  sought  to  determine  what  types  of  signals  are  optimal  for  obstacle 
perception  (Cotzin,  1942;  Cotzin,  et  al.,  1950;  Kohler,  1964;  Rice,  1966A,  1966B).  While 
traveling,  blind  subjects  use  various  types  of  self-generated  stimuli,  including  clicks,  hisses, 
and  the  sound  of  footsteps  to  elicit  the  echo  perception.  Additionally,  obstacles  can  be  detected
passively  through  the  localization  of  echoes  resulting  from  background  noise  (Kohler,
1964;  Rice,  1966A).  Self-generated  signals,  however,  were  found  to  yield  better  performance
(Kohler,  1964).  Rice  (1966B)  measured  performance  on  various  detection  and  discrimination 
tasks,  allowing  subjects  to  generate  whatever  signal  they  wished.  He  found  that  subjects  were 
equally  divided  in  their  choice  of  signals,  with  half  using  clicks  and  half  using  hisses.  There 
were  no  reliable  performance  differences  between  groups  using  the  different  signals,  and  most 
subjects  showed  no  change  in  performance  when  asked  to  use  the  nonpreferred  sound.  It 
appears  to  make  no  difference  whether  or  not  the  transmitted  signal  overlaps  the  echo  (Rice,
1966B).

Broadband  signals,  whether  pulsed  or  continuous,  are  better  suited  to  echolocation 
than  are  narrowband  or  tonal  signals  (Griffin,  1958;  Basset  and  Eastmond,  1964;  Rice,  1966B).
Large  time-bandwidth  products  provide  more  signal  energy,  thus  making  the  echoes  more
detectable  (Green,  1960).  In  two  studies  reported  by  Cotzin  (1942;  Cotzin,  et  al.,  1950),
subjects  detected  obstacles  very  accurately  using  thermal  noise  but  were  unable  to  perceive 
the  obstacles  using  tones  with  frequencies  less  than  8  kHz.  Cotzin  reported  that  perception 
was  quite  good  using  a  10-kHz  pure  tone,  but  as  Griffin  (1958)  suggests,  the  performance 
improvement  may  have  resulted  from  distortion  of  the  signal  since  10  kHz  approached  the
upper  frequency  limit  of  Cotzin’s  electronic  equipment.  Basset  and  Eastmond  (1964)  also
report  unreliable  echo  perceptions  using  tonal  stimuli,  due  to  the  excitation  of  standing  waves 
in  the  test  room. 

The  basis  of  the  obstacle  perception  seems  to  be  associated  with  a  rise  in  perceived 
pitch  as  obstacles  are  approached  (Cotzin,  1942;  Basset  and  Eastmond,  1964;  Wilson,  1966). 
Loudness  changes  were  found  not  to  be  sufficient  for  obstacle  detection  (Cotzin,  1942),  although
they  may  be  valuable  for  making  size  discriminations  based  on  target  strength  (Rice
and  Feinstein,  1965).  The  perceived  rise  in  signal  pitch  as  obstacles  are  approached  may  be
related  to  the  phenomenon  of  time-separation  pitch  (Basset  and  Eastmond,  1964;  Small, 
et  al.,  1963;  Yost  and  Hill,  1978).  When  two  broadband  stimuli  are  highly  correlated,  and  one 
is  delayed  relative  to  the  other,  a  pitch  is  perceived  whose  frequency  is  equal  to  the  reciprocal 
of  the  time  delay. 
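
The relation between obstacle distance and time-separation pitch can be illustrated numerically. The sketch below assumes that the listener is also the sound source, so that the echo is delayed by the two-way travel time, and it assumes a sound speed of 343 m/s in air; neither value is taken from the studies cited.

```python
SPEED_OF_SOUND_AIR = 343.0  # m/s, assumed value for air at about 20 deg C

def echo_delay(distance_m, c=SPEED_OF_SOUND_AIR):
    """Two-way travel time to an obstacle and back, in seconds."""
    return 2.0 * distance_m / c

def time_separation_pitch(distance_m, c=SPEED_OF_SOUND_AIR):
    """Pitch (Hz) equal to the reciprocal of the delay between signal and echo."""
    return 1.0 / echo_delay(distance_m, c)

# The pitch doubles each time the distance to the obstacle is halved.
for d in (2.0, 1.0, 0.5, 0.25):
    print(f"{d:5.2f} m: delay {echo_delay(d) * 1e3:6.2f} ms, "
          f"pitch {time_separation_pitch(d):7.1f} Hz")
```

Under these assumptions the pitch rises as the obstacle is approached, which is consistent with the subjective reports cited above.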

Numerous  investigators  have  sought  to  determine  what  types  of  detections  and/or  discriminations
can  be  made  via  echolocation.  While  large  individual  differences  exist  among
subjects’  ability  to  use  the  obstacle  sense,  it  has  been  found  that  the  performance  of  even  the
best  subjects  is  far  inferior  to  that  achievable  by  some  animals,  or  by  the  use  of  auditory  sensory
aids  employing  ultrasonic  frequencies  and  directional  beam  patterns.  The  following  is  a
summary  of  detections  and  discriminations  made  by  the  best  subjects  using  self-generated 
signals. 


1.  Size  and  distance.  Subjects  can  detect  large  objects  at  distances  exceeding  five 
meters  (Griffin,  1958).  The  sizes  of  just-detectable  objects  are  highly  correlated  with  distance 
(Rice,  Feinstein  and  Schusterman,  1965;  Rice,  1966B;  Rice,  1967).  At  close  ranges,  disks 
with  diameters  as  small  as  27  mm  have  been  detected  (Rice,  1966B).  The  solid  angle  which 
an  object  must  subtend  in  order  to  be  just  detectable  was  found  to  be  4.6  degrees  and  was 
relatively  independent  of  distance  (Rice,  Feinstein  and  Schusterman,  1965).  Subjects  have 
been  able  to  discriminate  between  objects  of  different  sizes  having  area  ratios  as  small  as 
1.07/1,  with  differences  in  intensity  serving  as  the  primary  discrimination  cue  (Rice  and 
Feinstein,  1965). 

2.  Depth  perception.  Kellogg  (1962)  reports  that  his  best  subject  could  accurately
discriminate  movement  of  4.3  inches  of  a  one-foot  disk  located  two  feet  in  front  of  him. 

3.  Shape  discriminations.  Rice  (1966B)  demonstrated  that  with  practice,  most  of  his 
subjects  could  accurately  identify  squares,  circles,  and  triangles,  all  objects  having  the  same
area.  He  also  found  that  squares  and  circles  were  more  detectable  than  a  rectangle  having  a 
16/1  edge  ratio.  The  detectability  of  the  rectangle  did  not  change  as  a  function  of  whether  it 
was  oriented  horizontally  or  vertically.  The  difficulty  in  detecting  the  rectangle  was  shown 
to  result  from  a  loss  of  echo  intensity  in  the  expected  direction. 

4.  Texture  and  material  composition.  Kellogg  (1962)  showed  that  subjects  could  discriminate
accurately  between  objects  of  the  same  size  made  of  metal,  wood,  denim  cloth,  and
velvet.  Subjects  reported  that  the  various  objects  simply  sounded  different.  Rice  (1966B)
suggests  that  these  discriminations  may  have  been  based  on  differences  in  target  strength.
Subject  performance  was  at  chance  level  when  attempting  to  discriminate  between  painted
and  unpainted  wood,  or  between  metal  and  glass  (Kellogg,  1962). 
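
The angular threshold quoted in item 1 implies that the size of a just-detectable object grows in proportion to its distance. The sketch below makes that explicit. It assumes that the 4.6-degree figure can be treated as the plane angle subtended by the diameter of the object, which is an interpretation and is not stated in the studies cited.

```python
import math

JUST_DETECTABLE_ANGLE_DEG = 4.6  # threshold quoted from Rice, Feinstein and Schusterman (1965)

def just_detectable_diameter(distance_m, angle_deg=JUST_DETECTABLE_ANGLE_DEG):
    """Diameter (m) of an object that subtends angle_deg at the given distance."""
    return 2.0 * distance_m * math.tan(math.radians(angle_deg) / 2.0)

for distance in (0.5, 1.0, 2.0, 5.0):
    size_cm = 100.0 * just_detectable_diameter(distance)
    print(f"{distance:4.1f} m: about {size_cm:5.1f} cm")
```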

While  the  types  of  echo  perception  discussed  in  this  section  are  extremely  valuable  to 
blind  travelers,  by  no  means  can  they  provide  a  complete  spatial  representation  of  the  environment.
Four  major  factors  limit  the  usefulness  of  the  obstacle  sense.

1.  People  are  limited  in  the  types  of  signal  which  they  can  produce.  As  will  be  discussed
in  the  next  section,  certain  advantages  can  be  gained  with  continuous-transmission  FM
signals  and  broadband  pulsed  signals. 

2.  The  long  wavelengths  associated  with  the  range  of  human  hearing  prevent  the  return 
of  a  sharp  sound  image,  i.e.,  fine  features  of  objects  cannot  be  discriminated. 

3.  People  cannot  transmit  a  directional  beam  pattern  such  that  only  a  small  angular 
region  would  be  ensonified. 

4.  Because  of  the  large  impedance  mismatch  between  air  and  most  surfaces  of  interest, 
only  initial  reflections  from  the  front  of  objects  are  returned;  that  is,  internal  object  structure
cannot  generally  be  discriminated  in  air. 

The  design  of  an  active  sonar  system  allows  a  choice  of  signals  and  transmitted  beam 
patterns,  as  well  as  allowing  the  use  of  ultrasonic  frequencies,  so  that  the  fine  features  of  a 
target  would  be  large  compared  to  a  wavelength.  Additionally,  many  of  the  disadvantages  of 
underwater  hearing  become  advantages  in  the  design  of  active  underwater  sonar  systems.  That 
is,  the  acoustic  impedance  of  water  more  closely  matches  that  of  many  solids,  so  that  object 
echoes  can  contain  information  about  internal  structure.  These  topics  are  discussed  in  the  next 
section  on  active  sonar  design. 
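
Limitations 2 and 4 above can be put into rough numbers. The sketch below computes the wavelength at several frequencies and the fraction of incident sound intensity reflected at a boundary between two media at normal incidence. The sound speeds and characteristic impedances are approximate textbook values assumed for illustration; they are not taken from the works cited.

```python
def wavelength(frequency_hz, sound_speed):
    """Wavelength in meters."""
    return sound_speed / frequency_hz

def intensity_reflection(z1, z2):
    """Fraction of intensity reflected at normal incidence going from medium 1 to medium 2."""
    r = (z2 - z1) / (z2 + z1)
    return r * r

# Approximate textbook values (assumptions, for illustration only).
C_AIR = 343.0        # m/s
Z_AIR = 415.0        # Pa*s/m
Z_WATER = 1.5e6      # Pa*s/m
Z_ALUMINUM = 17.0e6  # Pa*s/m

for f in (1e3, 10e3, 100e3):
    print(f"{f / 1e3:5.0f} kHz in air: wavelength {100.0 * wavelength(f, C_AIR):6.2f} cm")

for name, z in (("air", Z_AIR), ("water", Z_WATER)):
    transmitted = 1.0 - intensity_reflection(z, Z_ALUMINUM)
    print(f"{name} to aluminum: {100.0 * transmitted:.3f}% of the intensity enters the solid")
```

With these values almost none of the incident energy enters an aluminum target in air, whereas a substantial fraction does so in water, which is why echoes in water can carry information about internal structure.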

ACTIVE  SONAR  SYSTEMS


Since  the  Second  World  War,  a  large  number  of  sonar  sensory  aids  have  been  developed 
to  enhance  blind  mobility  (e.g.,  Witcher  and  Washington,  1954;  Twersky,  1948;  Beurle,  1951).
Most  of  these  aids  have  been  designed  solely  for  detection  of  nearby  obstacles.  That  is,  they
produce  an  audible  signal  when  an  obstacle  is  near,  but  they  provide  no  information  about
the  nature  of  the  obstacle  (Russell,  1966).  A  single  exception  is  a  series  of  sensory  aids  developed
by  Kay  (1962,  1966,  1979),  which  allow  binaural  localization  of  echoes,  estimation  of
distance,  and  certain  types  of  object  discriminations  (Kay,  1973). 

All  of  the  sonar  aids  which  provide  reliable  information  use  directional  beam  patterns, 
and  broadband,  ultrasonic  signals  which  are  either  heterodyned  or  digitally  stretched  into  the 
audible  frequency  range.  The  signals  used  have  either  been  broadband  pulses  (clicks)  or  continuous
transmission  FM  (CTFM)  signals.

In  addition  to  in-air  sonars,  considerable  work  has  been  done  to  investigate  the  sonar 
capabilities  of  marine  mammals  and  humans  using  dolphin-like  signals  (Au  and  Hammer,
1978A,  1978B;  Nachtigall,  et  al.,  1978;  Fish,  et  al.,  1976).  Much  of  the  information  from
these  investigations  can  be  applied  to  the  design  of  underwater  sonars  for  divers.  Because  the 
acoustic  impedance  of  water  closely  matches  that  of  many  solids,  discriminations  of  material 
composition  and  internal  object  structure,  which  are  impossible  in  air,  become  quite  easy  using
underwater  sonars.  The  advantages  of  various  types  of  signals  and  beam  patterns,  as  well
as  the  types  of  aural  discriminations  which  are  possible  using  high  resolution  sonars  in  air  and 
water  will  be  discussed  in  the  following  paragraphs. 

A  vast  amount  of  literature  has  appeared  on  the  subject  of  optimum  signal  design  for 
active  sonars.  Researchers  have  sought  to  determine  whether  pulsed  or  continuous  waveforms 
are  optimal,  what  the  effect  is  of  varying  the  pulse  shape  or  sweep  rate,  and  what  types  of 
signals  are  most  resistant  to  various  types  of  interference.  The  obvious  result  of  these  investigations
is  that  the  choice  of  signals  depends  on  the  application,  whether  that  is  detection  or
discrimination,  absolute  range  or  range  resolution,  or  doppler  information;  a  different  signal
is  optimum  for  each.

For  any  given  criterion  and  time  interval,  it  is  generally  true  that  CTFM  signals  yield 
a  higher  probability  of  detection  than  do  pulsed  signals  (Kay,  1960).  Pulsed  signals,  on  the 
other  hand,  yield  better  range  resolution,  i.e.,  the  number  of  objects  which  can  be  discriminated
in  a  given  range,  or  the  number  of  details  about  an  object  which  can  be  discriminated
(Kay,  1960).  CTFM  signals  are  probably  superior  for  determining  the  absolute  range  to  a 
target  and  the  rate  of  target  motion. 

If  swept  FM  signals  are  used,  target  detection  and  range  information  can  be  gained  by 
a  simple  multiplication  of  the  transmitted  and  received  signals  to  produce  a  difference  frequency
(Kay,  1979).  The  portion  of  the  sweep  cycle  at  which  the  signal  is  received  is  a  function
of  the  sweep  rate  and  the  two-way  travel  time  of  the  signal,  i.e.,  the  target  range.  Thus,
if  the  target  is  a  stationary  point  reflector,  the  output  of  a  system  which  multiplies  the  transmitted
and  received  signals  will  be  a  tone  whose  frequency  is  proportional  to  the  target
distance.  The  fact  that  the  system  output  is  tonal  causes  it  to  sound  very  different  from  background
noise,  thus  resulting  in  easier  initial  detections  for  CTFM  sonars  as  compared  to  pulsed
sonars  (Kay,  1960).  The  bandwidth  and  sweep  rate  of  the  transmitted  signal  can  be  adjusted 
so  that  frequencies  associated  with  different  ranges  can  be  resolved.  A  very  slow  sweep  rate 
would  not  allow  fine  distance  discriminations  at  close  range,  whereas  a  very  rapid  sweep  rate 
would  not  allow  long-range  discriminations.  In  the  design  of  a  CTFM  system,  it  would  be
beneficial  to  include  a  range  selector  to  vary  the  sweep  rate. 
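
The proportionality between difference frequency and range can be written out for a linear sweep. In the sketch below the difference frequency is the sweep rate multiplied by the two-way travel time. The sweep rates, the sound speed in air, and the function names are assumptions chosen only for illustration and do not describe any of the systems cited.

```python
def difference_frequency(range_m, sweep_rate_hz_per_s, c=343.0):
    """Output tone (Hz) for a stationary point reflector and a linear FM sweep."""
    two_way_travel_time = 2.0 * range_m / c
    return sweep_rate_hz_per_s * two_way_travel_time

def range_from_difference_frequency(f_hz, sweep_rate_hz_per_s, c=343.0):
    """Invert the relation above to recover the target range in meters."""
    return f_hz * c / (2.0 * sweep_rate_hz_per_s)

# Hypothetical range-selector settings: a slow and a fast sweep.
for sweep_rate in (50e3, 500e3):  # Hz per second
    tones = [difference_frequency(r, sweep_rate) for r in (0.5, 1.0, 5.0)]
    print(f"sweep {sweep_rate / 1e3:5.0f} kHz/s:",
          ", ".join(f"{t:8.1f} Hz" for t in tones))
```

With the slow sweep, targets at 0.5 m and 1 m produce tones only about 146 Hz apart, while with the fast sweep a target at 5 m already produces a tone above 14 kHz. This is the tradeoff that a range selector would be intended to manage.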

Kay  (1979)  has  designed  systems  with  auditory  displays  which  are  operable  over 
different  ranges,  depending  on  the  needs  of  the  user.  These  systems  have  been  designed  so 
that  the  received  frequency  decreases  with  decreasing  distance  to  the  target.  The  opposite 
approach,  associating  high  frequencies  with  close  ranges,  caused  the  signals  to  become  nearly 
inaudible  at  the  point  of  closest  approach  to  an  obstacle. 

The  output  of  Kay’s  CTFM  sonar  is  a  pure  tone  only  if  the  target  is  stationary  and 
contains  no  features  which  are  large  compared  to  a  wavelength.  If  the  target  is  in  motion  the 
system  output  is  a  time-varying  tonal  complex,  and  if  the  target  has  features  of  shape  or  texture
which  are  large  compared  to  a  wavelength  a  complex  tonal  structure  is  heard  (Riley,
1966;  Kay,  1979).  Although  the  frequency  resolution  of  the  auditory  system  required  to  discriminate
between  stationary  targets  is  quite  poor,  it  has  been  demonstrated  that  if  targets
are  textured  or  in  motion,  discrimination  is  vastly  improved  (Do  and  Kay,  1976-77;  Kay  and
Do,  1976-77).  The  received  complex  stimuli,  which  are  associated  with  various  classes  of  objects,
can  be  remembered  and  in  many  cases  can  be  generalized  to  include  new  objects  of  a
class. 


Another  beneficial  aspect  of  the  Kay  sonar  aid  is  binaural  presentation  of  spatial  information,
allowing  simultaneous  discrimination  of  multiple  targets  (Kay,  1979).  With  a  narrow
beam  pattern,  the  environment  can  be  scanned  to  selectively  ensonify  various  targets  or
to  determine  target  dimensions  (Riley,  1966).  With  a  broad  beam  pattern,  multiple  target 
information  can  be  extracted  from  the  complex  tonal  display,  and  directionality  can  often 
be  interpreted  via  intensity  differences  in  the  two  channels.  Although  directional  information 
has  been  coded  into  binaural  intensity  differences,  such  coding  is  range-dependent  (Kay, 
1979).  Thus,  new  localization  cues  must  be  learned  when  using  these  systems. 

If  the  sonar  has  a  narrow  beam  pattern,  fine  discriminations  can  be  made  by  scanning, 
but  only  radial  motion  can  be  perceived.  With  a  wide  beam  pattern,  object  motion  can  easily 
be  perceived,  but  azimuth  resolution  is  very  poor.  Multiple  targets  will  always  be  present  in 
the  field  of  view,  whereas  a  narrow  beam  sonar  requires  scanning  to  determine  an  object’s 
position.  It  would  certainly  be  desirable  to  design  a  system  which  allows  selection  of  either  a 
narrow  or  broad  beam  pattern.  A  large  portion  of  the  environment  could  be  viewed  with  the 
wide  beam,  and  then  fine  details  could  be  discriminated  by  scanning  with  the  narrow  beam. 
However,  the  design  of  transducer  arrays  to  implement  such  a  system  is  by  no  means  trivial. 

The  greatest  advantage  of  the  Kay  sonar  aids  over  other  air  sonars  is  gained  in  multiple-target
environments  with  objects  in  motion.  Binaural  presentation  of  spatial  information
allows  users  to  perceive  motion,  and  often  they  can  learn  to  separately  localize  multiple  targets.
Because  the  binaural  cues  are  new,  however,  subjects  require  considerable  practice  before
they  can  accurately  translate  the  lateralized  stimuli  into  three-dimensional  spatial  perception. 
Blind  and  blindfolded  subjects  have  had  considerable  success  at  navigating  obstacle  courses, 
discriminating  walls  from  fences,  and  identifying  various  objects  in  the  environment  such  as 
cars  and  parking  meters  (Riley,  1966;  Kay,  1973;  Kay,  1979).  Although  most  objects  produce 
unique  and  discriminable  echoes,  the  identification  of  most  objects  requires  context  cues.  For 
example,  a  parking  meter  would  probably  not  be  identified  as  such  in  an  environment  where 
parking  meters  are  not  generally  found. 

The  range  resolution  capabilities  of  broadband,  ultrasonic,  pulse  mode  sonars  would 
probably  be  quite  useful  for  object  identifications  in  air.  However,  the  use  of  broadband  pulses
as  sonar  signals  has  never  been  tried  in  air.  The  following  discussion  therefore  relates  to  the
use  of  pulsed  signals  for  echolocation  by  dolphins  and  humans  under  water.  It  is  difficult  to
generalize  these  results  to  air  environments,  where  only  front  surface  reflections  would  be 
returned.  However,  many  bats  using  pulsed  signals  have  displayed  nearly  optimal  performance 
on  complex  echo  discrimination  tasks. 

Dolphins,  using  very  short  broadband  clicks,  have  demonstrated  the  ability  to  discriminate
among  targets  of  various  shapes,  sizes,  and  material  compositions,  as  well  as  various
aspects  of  internal  target  structure.  The  following  is  a  brief  summary  of  some  of  the  discriminations
which  have  been  accomplished.

1.  Size.  Dolphins  have  discriminated  between  solid  steel  spheres  at  close  range  with
100%  correct  responses  when  the  spheres  had  diameters  of  5.40  and  6.35  cm  (Norris,  et  al.,
1966).  Performance  dropped  to  77%  correct  when  the  diameters  were  5.7  and  6.35  cm.
Differences  in  echo  intensity  are  probably  the  most  salient  discrimination  cues.  Performance
was  near  100%  correct  when  discriminating  between  hollow  aluminum  cylinders  having  diameters
of  7.6  and  6.35  cm  (Au  and  Hammer,  1978A).

2.  Shape.  Dolphins  have  successfully  discriminated  between  cylinders  and  cubes, 
independent  of  target  aspect,  except  for  cases  in  which  the  flat  top  of  the  cylinder  was  facing 
the  animal  (Nachtigall,  et  al.,  1978).  These  discriminations  are  based  on  angular  variations  in 
target  strength,  as  well  as  perception  of  multiple  echoes  from  surfaces  meeting  at  an  edge 
(Fish,  et  al.,  1976;  Nachtigall,  et  al.,  1978). 

3.  Material  composition.  Discrimination  with  80%  correct  responses  or  higher  has 
been  demonstrated  between  dimensionally  identical  cylinders  in  the  following  cases:  aluminum
vs.  rock,  aluminum  vs.  steel,  and  aluminum  vs.  bronze  (Au  and  Hammer,  1978A).  On  a
discrimination  task  involving  aluminum  and  glass  cylinders,  performance  varied  from  80% 
correct  to  chance  as  target  size  was  increased. 

4.  Internal  object  structure.  Discrimination  between  hollow  and  solid  aluminum  cylinders
has  been  shown  to  be  quite  easy,  with  performance  near  100%  correct.  Additionally,
dolphins  have  discriminated  between  aluminum  cylinders  having  differences  in  wall  thickness 
of  1.6  mm  (Au  and  Hammer,  1978A).  Smaller  differences  in  wall  thickness  could  probably 
be  discriminated,  but  difference  thresholds  were  not  measured  in  the  study  cited. 

Many  attempts  have  been  made  to  understand  the  auditory  cues  which  dolphins  may 
use  to  make  echo  discriminations.  The  time  series  associated  with  the  echo  from  a  target  generally
involves  an  initial  high-amplitude  peak  associated  with  the  front  surface  reflection.  A
series  of  secondary  reflections  follows  with  decreasing  amplitudes.  In  the  case  of  cylinders, 
these  reflections  result  from  various  transverse,  circumferential,  and  square  acoustic  paths 
through  the  material  (Shirley  and  Diercks,  1970).  Subsequent  reflections  along  a  given  path 
will  be  periodic  with  successively  decreasing  amplitudes.  When  targets  are  of  very  different 
sizes  or  materials  with  large  differences  in  acoustic  properties,  discriminations  can  probably 
be  made  based  on  the  time  series  alone.  However,  when  targets  are  of  similar  size  and  composition,
the  time  waveforms  do  not  allow  unambiguous  discriminations  (Au  and  Hammer,
1978B).  Likewise,  the  power  spectrum  generally  does  not  supply  sufficient  information  to
make  the  complex  discriminations  which  have  been  demonstrated.  Au  and  Hammer  (1978B)
have  found  a  high  degree  of  correspondence  between  the  animals’  discrimination  performance
and  a  scheme  which  processes  the  echoes  through  a  filter  whose  transfer  function  is  the  complex
conjugate  of  the  transmitted  signal’s  spectrum  (a  matched  filter).  That  is,  the  matched  filter  responses  for  targets  which
were  easily  discriminable  could  be  discriminated  visually  with  little  problem.  However,  for 
targets  which  were  difficult  to  discriminate  the  matched  filter  responses  were  very  similar. 
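
A matched filter of the kind described is equivalent, in the time domain, to cross-correlating the echo with the transmitted signal. The sketch below illustrates this on synthetic data only. The five-element Barker code standing in for a broadband click, the reflection delays and amplitudes, and the function names are all hypothetical; no measured dolphin signals or echoes are reproduced.

```python
def matched_filter(echo, pulse):
    """Correlate the echo with the transmitted pulse (time-domain matched filter)."""
    m = len(pulse)
    return [sum(pulse[k] * echo[lag + k] for k in range(m))
            for lag in range(len(echo) - m + 1)]

def synthesize_echo(pulse, reflections, length):
    """Sum delayed, scaled copies of the pulse; reflections = [(delay, amplitude), ...]."""
    echo = [0.0] * length
    for delay, amplitude in reflections:
        for k, p in enumerate(pulse):
            echo[delay + k] += amplitude * p
    return echo

def peaks(response, threshold):
    """Lags at which the matched filter output reaches the threshold."""
    return [lag for lag, value in enumerate(response) if value >= threshold]

PULSE = [1.0, 1.0, 1.0, -1.0, 1.0]  # 5-element Barker code, a stand-in for a click

# Two hypothetical targets: the same front-surface reflection, but a secondary
# reflection that arrives at a different delay.
target_a = synthesize_echo(PULSE, [(5, 1.0), (14, 0.5)], 30)
target_b = synthesize_echo(PULSE, [(5, 1.0), (17, 0.5)], 30)

print(peaks(matched_filter(target_a, PULSE), 2.0))  # [5, 14]
print(peaks(matched_filter(target_b, PULSE), 2.0))  # [5, 17]
```

The filter output compresses each reflection into a narrow peak, so that the spacing of secondary reflections, which differs between the two synthetic targets, can be read directly from the response.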

Two  major  problems  with  the  assumption  that  matched  filters  are  actually  used  arise 
because  the  transmitted  signals  for  a  given  animal  differ  considerably  from  click  to  click.  The 
filter  would  have  to  be  adaptive  since  no  “optimum”  signal  seems  to  be  used.  Secondly,  the 
hypothesis  requires  that  the  animal  devote  a  great  deal  of  memory  to  the  storage  of  exact 
waveforms  (Johnson  and  Titlebaum,  1976).  However,  an  analysis  very  similar  to  matched  filters
has  been  invoked  to  explain  the  phenomenon  of  time-separation  pitch  (Johnson  and
Titlebaum,  1976).  It  seems  likely  that  this  mechanism  of  pitch  perception  associated  with  the
reciprocal  of  the  time  interval  between  correlated  reflections  might  help  to  explain  the  mechanisms
of  echolocation.

Most  of  the  discriminations  discussed  above  would  not  be  possible  using  CTFM  signals 
unless  a  bandwidth  several  times  that  of  the  pulsed  signals  was  used.  The  echo  information  required
for  discrimination  is  separated  in  time  by  only  a  few  microseconds  (Au  and  Hammer,
1978B),  and  the  frequency  separation  necessary  to  provide  this  degree  of  range  resolution
with  CTFM  signals  would  not  be  practically  achievable.  Additionally,  if  an  extremely  rapid 
sweep  rate  were  used,  so  that  closely-spaced  echoes  were  resolvable  as  separate  frequencies, 
the  sonar  would  be  of  little  use  for  ranges  greater  than  the  diameter  of  the  target. 
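
The tradeoff described in this paragraph can be illustrated with rough numbers. In the sketch below, two echoes separated by a given delay produce difference frequencies separated by the sweep rate multiplied by that delay, and the listener is assumed to need some minimum frequency separation in order to hear them as distinct. The 2-microsecond echo spacing, the 50-Hz separation, the 10-kHz upper limit of the display, and the sound speed in water are all hypothetical values chosen for illustration.

```python
def required_sweep_rate(echo_spacing_s, min_tone_separation_hz):
    """Sweep rate (Hz/s) at which echoes echo_spacing_s apart give tones that far apart."""
    return min_tone_separation_hz / echo_spacing_s

def max_displayable_range(sweep_rate_hz_per_s, max_tone_hz, c):
    """Largest range (m) whose difference frequency stays below max_tone_hz."""
    return c * max_tone_hz / (2.0 * sweep_rate_hz_per_s)

C_WATER = 1500.0  # m/s, assumed

rate = required_sweep_rate(echo_spacing_s=2e-6, min_tone_separation_hz=50.0)
limit = max_displayable_range(rate, max_tone_hz=10e3, c=C_WATER)
print(f"sweep rate needed: {rate / 1e6:.0f} MHz per second")
print(f"usable range at that sweep rate: {limit:.2f} m")
```

Under these assumptions the usable range shrinks to a fraction of a meter, which agrees in spirit with the conclusion that such a sonar would be of little use beyond distances comparable to the target itself.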

Although  not  much  information  is  available  concerning  human  listeners  using  underwater
pulsed  sonars,  several  studies  have  shown  that  human  performance  is  often  as  good  as
and  in  some  cases  better  than  that  of  dolphins.  Fish,  et  al.  (1976)  trained  divers  to  discriminate
between  various  plates  one  meter  in  front  of  them,  using  a  head-coupled  sonar  and  digitally
stretched  ultrasonic  pulses  which  were  produced  on  the  surface.  Subjects  discriminated
between  plates  which  varied  in  shape  (squares,  circles,  and  triangles),  material  composition 
(copper,  brass,  or  aluminum)  and  thickness.  The  performance  of  the  divers  was  between  80 
and  100%  correct,  and  was  in  some  instances  approximately  10%  better  than  that  of  dolphins 
on  the  same  task.  A  subject  outside  the  diving  tank  who  monitored  the  sonar  signals  through 
headphones  performed  very  poorly  on  all  discrimination  tasks  because  he  could  not  coordinate 
the  stimuli  with  the  diver’s  scanning  behavior  (Fish,  et  al.,  1976). 

Studies  presently  underway  at  the  Naval  Ocean  Systems  Center,  Hawaii  Laboratory, 
have  investigated  the  skill  with  which  humans  can  discriminate  between  various  types  of 
cylinders  differing  in  composition,  size,  and  wall  thickness.  Subjects  wearing  headphones  in 
a  sound  booth  have  been  asked  to  discriminate  between  prerecorded  echoes  from  the  various 
cylinders.  The  echoes  have  been  digitally  stretched  to  have  center  frequencies  of  approximately
2.2  kHz.  Under  these  favorable  listening  conditions,  subjects  have  been  able  to  make  the
following  discriminations  with  nearly  100%  correct  responses. 

1.  Solid  rock  vs.  hollow  aluminum,

2.  Solid  vs.  hollow  aluminum, 

3.  Hollow  aluminum  —  7.62  vs.  3.18  cm  diameter, 

4.  Hollow  aluminum  vs.  steel, 

5.  Hollow  aluminum  vs.  bronze. 

Subjects  have  also  demonstrated  the  ability  to  discriminate  between  echoes  from  cylinders 
which  differ  only  in  wall  thickness  when  the  difference  in  wall  thicknesses  was  1.6  mm. 
Finally,  human  subjects  have  performed  at  nearly  85%  correct  when  discriminating  between 
hollow  aluminum  and  glass  cylinder  echoes.  Dolphins  performed  near  chance  level  on  a  similar 
task. 


The  most  salient  cue  in  the  aluminum-glass  discrimination  task  seems  to  be  a  duration 
difference  between  echoes.  Reflections  from  the  glass  cylinders  tend  to  damp  out  more  quickly
than  those  from  the  aluminum.  For  cylinders  having  diameters  of  7.6  cm,  this  duration
difference  was  on  the  order  of  150  microseconds  and  was  apparently  not  perceptible  to  the 
dolphin.  However,  when  the  signals  are  digitally  stretched  the  duration  difference  is  on  the 
order  of  7.5  milliseconds.  This  is  readily  perceptible  to  the  human  subjects  as  an  additional 
hiss  associated  with  the  aluminum  targets. 
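
The two durations quoted above imply the time-stretch factor applied to the echoes. The short calculation below derives it, and then, assuming that the same factor was applied to the stimuli with 2.2-kHz center frequencies (an inference, not a figure given in the text), estimates the center frequency of the original echoes.

```python
def stretch_factor(original_duration_s, stretched_duration_s):
    """Ratio by which the signal was slowed down."""
    return stretched_duration_s / original_duration_s

def original_frequency(stretched_frequency_hz, factor):
    """Frequency before stretching; slowing a signal divides its frequencies by the factor."""
    return stretched_frequency_hz * factor

factor = stretch_factor(150e-6, 7.5e-3)
print(f"stretch factor: {factor:.0f}")
print(f"implied original center frequency: {original_frequency(2200.0, factor) / 1e3:.0f} kHz")
```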

It  is  probable  that  differences  in  time-separation  pitch  between  periodic  and  highly 
correlated  reflections  from  the  targets  provide  humans  with  another  salient  discrimination 
cue.  The  perception  of  time-separation  pitch,  in  conjunction  with  duration  cues,  seems  to  be
the  basis  for  the  human  discrimination  performance  in  the  above  studies.  In  addition  to  more
favorable  listening  conditions,  and  the  ability  to  use  cues  associated  with  stretching  of  the
signals,  human  subjects  have  one  other  obvious  advantage  over  dolphins  on  the  discrimination
tasks  discussed:  Training  is  easier  and  more  precise  since  humans  can  understand  verbal  instructions.
Through  instructions,  they  can  readily  learn  to  categorize  a  large  number  of  signals
into  appropriate  classes.  A  simple  explanation  is  often  sufficient  to  teach  subjects  to  ignore 
obvious  but  irrelevant  cues. 

It  seems  clear  from  data  cited  in  this  and  the  previous  section  that  a  great  deal  of  information
could  be  gained  by  equipping  a  remote  sensing  system  with  active  and  passive
echolocation  capabilities.  While  considerable  flexibility  is  available  in  the  design  of  an  air  or 
water  sonar,  the  following  general  guidelines  will  aid  the  resolution  and  overall  utility  of  the 
system. 


Any  high  resolution  sonar  must  utilize  broadband,  ultrasonic  signals.  Sonars  employing
pure  tones  are  only  capable  of  returning  detection  and  range  information.  The  use  of  ultrasound
permits  the  return  of  a  sharp  object  image,  since  only  object  features  which  are  large
compared  to  a  wavelength  can  be  discriminated.  In  addition,  high  frequency  signals  can
generally  be  transmitted  with  smaller  and  more  directional  transducers  than  can  low  frequency 
signals. 
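
The link between frequency, transducer size, and directionality can be illustrated with the standard result for a circular piston radiator, whose main lobe extends to a first null at an angle given by sin(theta) = 1.22 * wavelength / diameter. The sketch below applies this relation. The piston model, the 5-cm diameter, and the sound speed in air are assumptions made for illustration and do not describe any particular transducer.

```python
import math

def first_null_half_angle_deg(frequency_hz, diameter_m, c=343.0):
    """Angle (degrees) from the axis to the first null of a circular piston.

    Returns None when the wavelength is too long for a null to exist, in which
    case the radiator is nearly omnidirectional.
    """
    x = 1.22 * (c / frequency_hz) / diameter_m
    if x >= 1.0:
        return None
    return math.degrees(math.asin(x))

DIAMETER = 0.05  # m, hypothetical transducer diameter

for f in (5e3, 20e3, 60e3, 120e3):
    angle = first_null_half_angle_deg(f, DIAMETER)
    label = "no null (nearly omnidirectional)" if angle is None else f"{angle:5.1f} deg"
    print(f"{f / 1e3:5.0f} kHz: {label}")
```

For a fixed transducer size the beam narrows as the frequency rises; equivalently, a given beamwidth can be obtained with a smaller transducer at a higher frequency.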


The  use  of  a  narrow  beam  pattern  allows  fine  discrimination  via  scanning,  whereas  a 
wide  beam  pattern  will  ensonify  a  large  area  and  consequently  enhance  motion  perception. 

It  would  probably  be  desirable  to  design  a  system  which  could  employ  either  of  two  beamwidths.
The  wide  beam  could  be  used  initially  to  determine  if  any  relevant  information  will
be  returned  from  a  given  sector,  and  the  narrow  beam  then  could  be  used  to  scan  for  details 
if  objects  of  interest  are  detected.  Furthermore,  the  system  should  allow  the  operator  to 
select  the  maximum  range  of  operation.  A  system  which  is  designed  for  long-range  detection 
will  give  very  poor  depth  perception  for  nearby  objects. 

Finally,  a  system  utilizing  an  auditory  display  should  be  binaural,  and  should  be  coupled
to  the  head  rather  than  hand-held.  When  scanning  with  a  head-mounted  system,  the  operator
will  be  able  to  coordinate  accurately  the  direction  of  the  transmitted  beam  with  other
auditory  and  visual  sensory  information.  If  the  system  is  not  head-mounted,  it  is  difficult  to 
correlate  scanning  behavior  with  other  localization  cues. 